Smart Paper Notes

BMD-GAN: Bone mineral density estimation using x-ray image decomposition into projections of bone-segmented quantitative computed tomography using hierarchical learning

Yi Gu , Yoshito Otake , Keisuke Uemura , Mazen Soufi , Masaki Takao , Nobuhiko Sugano , Yoshinobu Sato

Category: Computer Vision

2022-07-07

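We propose a method for estimating bone mineral density (BMD) from a plain X-ray image. Dual-energy X-ray absorptiometry (DXA) and quantitative computed tomography (QCT) provide high accuracy in diagnosing osteoporosis; however, these modalities require special equipment and scan protocols. Measuring BMD from an X-ray image enables opportunistic screening, which is potentially useful for early diagnosis. Previous methods that directly learn the relationship between X-ray images and BMD require large training datasets to achieve high accuracy because of the large intensity variations in X-ray images. We therefore propose an approach that uses QCT to train a generative adversarial network (GAN) and decomposes an X-ray image into a projection of bone-segmented QCT. The proposed hierarchical learning improves the robustness and accuracy of quantitatively decomposing a small-area target. An evaluation of 200 patients with osteoarthritis using the proposed method, which we named BMD-GAN, showed a Pearson correlation coefficient of 0.888 between the predicted and ground-truth DXA-measured BMD. Besides not requiring a large-scale training database, another advantage of our method is its extensibility to other anatomical areas, such as the vertebrae and ribs.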
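For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a Pearson correlation coefficient between predicted and DXA-measured BMD could be computed. The function name `pearson_r` and the sample values are illustrative assumptions, not data or code from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical BMD values in g/cm^2 (made up for illustration only).
predicted = [0.62, 0.71, 0.80, 0.93, 1.05]
measured = [0.60, 0.75, 0.78, 0.95, 1.01]
r = pearson_r(predicted, measured)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 means the predictions rise and fall with the reference measurements; it does not by itself show that the two agree in absolute terms.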

I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

Chandra Bhagavatula , Jena D. Hwang , Doug Downey , Ronan Le Bras , Ximing Lu , Keisuke Sakaguchi , Swabha Swayamdipta , Peter West , Yejin Choi

Category: Natural Language Processing

2022-12-19

Pre-trained language models, despite their rapid advancements powered by scale, still fall short of robust commonsense capabilities. And yet, scale appears to be the winning recipe; after all, the largest models seem to have acquired the largest amount of commonsense capabilities. Or is it? In this paper, we investigate the possibility of a seemingly impossible match: can smaller language models with dismal commonsense capabilities (i.e., GPT-2), ever win over models that are orders of magnitude larger and better (i.e., GPT-3), if the smaller models are powered with novel commonsense distillation algorithms? The key intellectual question we ask here is whether it is possible, if at all, to design a learning algorithm that does not benefit from scale, yet leads to a competitive level of commonsense acquisition. In this work, we study the generative models of commonsense knowledge, focusing on the task of generating generics, statements of commonsense facts about everyday concepts, e.g., birds can fly. We introduce a novel commonsense distillation framework, I2D2, that loosely follows the Symbolic Knowledge Distillation of West et al. but breaks the dependence on the extreme-scale models as the teacher model by two innovations: (1) the novel adaptation of NeuroLogic Decoding to enhance the generation quality of the weak, off-the-shelf language models, and (2) self-imitation learning to iteratively learn from the model's own enhanced commonsense acquisition capabilities. Empirical results suggest that scale is not the only way, as novel algorithms can be a promising alternative. Moreover, our study leads to a new corpus of generics, Gen-A-Tomic, that is of the largest and highest quality available to date.

Learning Locally, Communicating Globally: Reinforcement Learning of Multi-robot Task Allocation for Cooperative Transport

Kazuki Shibata , Tomohiko Jimbo , Tadashi Odashima , Keisuke Takeshita , Takamitsu Matsubara

Category: Robotics

2022-12-06

We consider task allocation for multi-object transport using a multi-robot system, in which each robot selects one object among multiple objects with different and unknown weights. The existing centralized methods assume the number of robots and tasks to be fixed, which is inapplicable to scenarios that differ from the learning environment. Meanwhile, the existing distributed methods limit the minimum number of robots and tasks to a constant value, making them applicable to various numbers of robots and tasks. However, they cannot transport an object whose weight exceeds the load capacity of robots observing the object. To make it applicable to various numbers of robots and objects with different and unknown weights, we propose a framework using multi-agent reinforcement learning for task allocation. First, we introduce a structured policy model consisting of 1) predesigned dynamic task priorities with global communication and 2) a neural network-based distributed policy model that determines the timing for coordination. The distributed policy builds consensus on the high-priority object under local observations and selects cooperative or independent actions. Then, the policy is optimized by multi-agent reinforcement learning through trial and error. This structured policy of local learning and global communication makes our framework applicable to various numbers of robots and objects with different and unknown weights, as demonstrated by numerical simulations.

Hybrid Life: Integrating Biological, Artificial, and Cognitive Systems

Manuel Baltieri , Hiroyuki Iizuka , Olaf Witkowski , Lana Sinapayen , Keisuke Suzuki

Category: Artificial Intelligence

2022-12-01

Artificial life is a research field studying what processes and properties define life, based on a multidisciplinary approach spanning the physical, natural and computational sciences. Artificial life aims to foster a comprehensive study of life beyond "life as we know it" and towards "life as it could be", with theoretical, synthetic and empirical models of the fundamental properties of living systems. While still a relatively young field, artificial life has flourished as an environment for researchers with different backgrounds, welcoming ideas and contributions from a wide range of subjects. Hybrid Life is an attempt to bring attention to some of the most recent developments within the artificial life community, rooted in more traditional artificial life studies but looking at new challenges emerging from interactions with other fields. In particular, Hybrid Life focuses on three complementary themes: 1) theories of systems and agents, 2) hybrid augmentation, with augmented architectures combining living and artificial systems, and 3) hybrid interactions among artificial and biological systems. After discussing some of the major sources of inspiration for these themes, we will focus on an overview of the works that appeared in Hybrid Life special sessions, hosted by the annual Artificial Life Conference between 2018 and 2022.

Location analysis of players in UEFA EURO 2020 and 2022 using generalized valuation of defense by estimating probabilities

Rikuhei Umemoto , Kazushi Tsutsui , Keisuke Fujii

Category: Machine Learning

2022-11-30

Analyzing defenses in team sports is generally challenging because of the limited event data. Researchers have previously proposed methods to evaluate football team defense by predicting the events of ball gain and being attacked using locations of all players and the ball. However, they did not consider the importance of the events, assumed the perfect observation of all 22 players, and did not fully investigate the influence of diversity (e.g., nationality and sex). Here, we propose a generalized valuation method of defensive teams by score-scaling the predicted probabilities of the events. Using the open-source location data of all players in broadcast video frames in football games of men's Euro 2020 and women's Euro 2022, we investigated the effect of the number of players on the prediction and validated our approach by analyzing the games. Results show that for the predictions of being attacked, scoring, and conceding, all players' information was not necessary, while that of ball gain required information on three to four offensive and defensive players. With game analyses we explained the excellence in defense of finalist teams in Euro 2020. Our approach might be applicable to location data from broadcast video frames in football games.

MR4MR: Mixed Reality for Melody Reincarnation

Atsuya Kobayashi , Ryogo Ishino , Ryuku Nobusue , Takumi Inoue , Keisuke Okazaki , Shoma Sawa , Nao Tokui

Category: Artificial Intelligence

2022-09-15

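There is a long history of efforts to explore musical elements through the entities and spaces around us, such as musique concrète and ambient music. In the context of computer music and digital art, interactive experiences that focus on surrounding objects and physical spaces have also been designed. In recent years, with the development and spread of devices, an increasing number of works have been designed in extended reality to create such musical experiences. In this paper, we describe MR4MR, a sound installation work that lets users experience melodies produced by interactions with their surrounding space in the context of mixed reality (MR). Using HoloLens, users can bump virtual objects against real objects in their surroundings. Then, by continuously creating a melody that follows the sounds made by the objects and regenerating a randomly and gradually changing melody with music-generation machine learning models, users can feel their ambient melody "reincarnating".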

Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows

Keisuke Shirai , Atsushi Hashimoto , Taichi Nishimura , Hirotaka Kameko , Shuhei Kurita , Yoshitaka Ushiku , Shinsuke Mori

Category: Natural Language Processing | Artificial Intelligence

2022-09-13

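We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the result of each cooking action. The dataset consists of object state changes and the workflow of the recipe text. A state change is represented as an image pair, while the workflow is represented as a recipe flow graph (R-FG). The image pairs are grounded in the R-FG, which provides the cross-modal relation. With our dataset, one can try a range of applications, from multimodal commonsense reasoning to procedural text generation.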

Automatic detection of faults in race walking from a smartphone camera: a comparison of an Olympic medalist and university athletes

Tomohiro Suzuki , Kazuya Takeda , Keisuke Fujii

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-08-24

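Automatic fault detection is a major challenge in many sports. In race walking, referees judge faults visually according to the rules, so ensuring objectivity and fairness in judging is important. To address this issue, some studies have attempted to detect faults automatically using sensors and machine learning. However, there are problems associated with sensor attachments and equipment such as high-speed cameras, which conflict with the visual judgment of referees, and with the interpretability of the fault detection models. In this study, we propose a fault detection system based on non-contact measurement. We used pose estimation and machine learning models trained on the judgments of multiple qualified referees to achieve fair fault judgment. We verified the system using smartphone videos of normal race walking and of walking with intentional faults by several athletes, including a medalist of the Tokyo Olympics. The validation results show that the proposed system detected faults with an average accuracy of over 90%. We also show that the machine learning model detects faults in accordance with the rules of race walking. In addition, the medalist's intentionally faulty walking movement differed from that of the university walkers. This finding informs the realization of a more general fault detection model. The code and data are available at https://github.com/szucchini/racewalk-aijudge.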

RealTime QA: What's the Answer Right Now?

Jungo Kasai , Keisuke Sakaguchi , Yoichi Takahashi , Ronan Le Bras , Akari Asai , Xinyan Yu , Dragomir Radev , Noah A. Smith , Yejin Choi , Kentaro Inui

Category: Natural Language Processing

2022-07-27

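We introduce RealTime QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). RealTime QA asks about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges the static, conventional assumptions in open-domain QA datasets and pursues instantaneous applications. We build strong baseline models on large pretrained language models, including GPT-3 and T5. Our benchmark is an ongoing effort, and this preliminary report presents real-time evaluation results over the past month. Our experimental results show that GPT-3 can often properly update its generation results based on newly retrieved documents, highlighting the importance of up-to-date information retrieval. Nonetheless, we find that GPT-3 tends to return outdated answers when the retrieved documents do not provide sufficient information to find an answer. This suggests an important avenue for future research: can an open-domain QA system identify such unanswerable cases and communicate with the user, or even the retrieval module, to modify the retrieval results? We hope that RealTime QA will spur progress in instantaneous applications of question answering and beyond.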

Description and Discussion on DCASE 2022 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Applying Domain Generalization Techniques

Kota Dohi , Keisuke Imoto , Noboru Harada , Daisuke Niizumi , Yuma Koizumi , Tomoya Nishida , Harsh Purohit , Takashi Endo , Masaaki Yamamoto , Yohei Kawaguchi

Category: Machine Learning | Machine Learning (Statistics)

2022-06-13

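We present the task description of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2022 Challenge Task 2: "Unsupervised anomalous sound detection (ASD) for machine condition monitoring applying domain generalization techniques". Domain shifts are a critical problem for the application of ASD systems. Because domain shifts can change the acoustic characteristics of data, a model trained in a source domain performs poorly in a target domain. In DCASE 2021 Challenge Task 2, we organized an ASD task for handling domain shifts. In that task, the occurrence of domain shifts was assumed to be known. In practice, however, the domain of each sample may not be given, and domain shifts can occur implicitly. In 2022 Task 2, we focus on domain generalization techniques that detect anomalies regardless of domain shifts. Specifically, the domain of each sample is not given in the test data, and only one threshold is allowed for all domains. We will add the challenge results and an analysis of the submissions after the challenge submission deadline.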
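The single-threshold constraint can be illustrated with a minimal sketch. It assumes that some model has already produced an anomaly score per clip, and that the threshold is taken as a percentile of scores on normal training clips pooled over both domains. The percentile rule, the function names, and all score values are assumptions made for illustration; they are not the challenge baseline.

```python
def pick_threshold(normal_scores, percentile=90):
    """Return one decision threshold from anomaly scores of normal clips.

    Uses integer arithmetic for the index so the result is deterministic.
    """
    if not normal_scores:
        raise ValueError("need at least one score")
    ordered = sorted(normal_scores)
    index = min(len(ordered) - 1, (percentile * len(ordered)) // 100)
    return ordered[index]


def detect(scores, threshold):
    """Flag a clip as anomalous when its score exceeds the shared threshold."""
    return [score > threshold for score in scores]


# Hypothetical anomaly scores of normal training clips, per domain.
source_normal = [0.10, 0.12, 0.15, 0.11, 0.13]
target_normal = [0.20, 0.22, 0.18, 0.25, 0.21]

# One threshold for all domains: pool the scores instead of fitting per domain.
threshold = pick_threshold(source_normal + target_normal)

# Test clips arrive without a domain label, so the same threshold applies to all.
test_scores = [0.12, 0.30, 0.19, 0.45]
print(threshold, detect(test_scores, threshold))
```

Because target-domain clips tend to score higher in this made-up data, a shared threshold has to sit high enough to avoid false alarms on them, which is the kind of trade-off that domain generalization aims to reduce.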
